Alternative Parallelization Strategies in EST Clustering

Authors

  • Nishank Trivedi
  • Kevin T. Pedretti
  • Terry A. Braun
  • Todd E. Scheetz
  • Thomas L. Casavant

Abstract

One of the fundamental components of large-scale gene discovery projects is the clustering of Expressed Sequence Tags (ESTs) from complementary DNA (cDNA) clone libraries. Clustering is used to create non-redundant catalogs and indices of these sequences. In particular, clustering of ESTs is frequently used to estimate the number of genes derived from cDNA-based gene discovery efforts. This paper presents a novel parallel extension to an EST clustering program, UIcluster4, that incorporates alternative splicing information and a new parallelization strategy. The results are compared to other parallelized EST clustering systems in terms of overall processing time and accuracy of the resulting clustering.

Similar resources

Gene transcript clustering: a comparison of parallel approaches

One of the fundamental components of large-scale gene discovery projects is the clustering of expressed sequence tags (ESTs) from complementary DNA (cDNA) clone libraries. Clustering is used to create non-redundant catalogs and indices of these sequences. In particular, clustering of ESTs is frequently used to estimate the number of genes derived from cDNA-based gene discovery efforts. This ...

ECgene: genome-based EST clustering and gene modeling for alternative splicing.

With the availability of the human genome map and fast algorithms for sequence alignment, genome-based EST clustering became a viable method for gene modeling. We developed a novel gene-modeling method, ECgene (Gene modeling by EST Clustering), which combines genome-based EST clustering and the transcript assembly procedure in a coherent and consistent fashion. Specifically, ECgene takes altern...

Parallelization of Multi-objective Evolutionary Algorithms Using Clustering Algorithms

While Single-Objective Evolutionary Algorithms (EAs) parallelization schemes are both well established and easy to implement, this is not the case for Multi-Objective Evolutionary Algorithms (MOEAs). Nevertheless, the need for parallelizing MOEAs arises in many real-world applications, where fitness evaluations and the optimization process can be very time consuming. In this paper, we test the ...

An overview of the wcd EST clustering tool

The wcd system is an open source tool for clustering expressed sequence tags (ESTs) and other DNA and RNA sequences. wcd allows efficient all-versus-all comparison of ESTs using either the d(2) distance function or edit distance, improving existing implementations of d(2). It supports merging, refinement and reclustering of clusters. It is 'drop in' compatible with the StackPack clust...

Parallelizing single patch pass clustering

Clustering algorithms such as k-means, the self-organizing map (SOM), or Neural Gas (NG) constitute popular tools for automated information analysis. Since data sets are becoming larger and larger, it is vital that the algorithms perform efficiently on huge data sets. Here we propose a parallelization of patch neural gas which requires only a single run over the data set and which can work with ...

Publication date: 2003